Electrostatic actuators provide a promising approach to creating soft robotic sheets, due to their flexible form factor, modular integration, and fast response speed. However, their control requires kilovolt signals, as well as an understanding of the complex dynamics arising from force interactions caused by on-board and environmental effects. In this work, we demonstrate an untethered planar five-actuator piezoelectric robot powered by batteries and an on-board high-voltage circuit, and controlled through a wireless link. The scalable fabrication approach is based on bonding layers to one another (steel-foil substrate, actuators, flexible electronics). The robot exhibits a range of controllable motions, including bidirectional crawling (at up to ~0.6 cm/s), turning, and in-place rotation (at ~1 °/s). High-speed videos and control experiments show that the richness of the motion results from the interaction of the robot's asymmetric mass distribution with the corresponding dependence of the dynamics on the piezoelectric driving frequency.
Electrically driven soft robots enable small, lightweight bodies, environmental compatibility, diverse motions, and safe operation. In particular, electrostatic actuators (e.g., piezoelectric ones) respond quickly. However, methods for scalable, seamless integration and untethered operation remain unclear. Moreover, soft-body modeling that includes environmental interactions is a long-standing challenge, and more locomotion mechanisms need to be explored. In this paper, we design, model, and demonstrate a soft robot that, for the first time, begins to address all of these issues. It features a planar structure with a linear array of five actuators, opening the door to integration and untethered operation. A new inchworm-inspired crawling locomotion mechanism relying on pose self-adjustment is designed and verified. The first analytical soft-body model that includes piezoelectricity, gravity, and ground interaction, and that well explains the robot's motion, is developed and validated experimentally. We demonstrate forward and backward motion of the robot and explore the effects of payload and driving speed: at 1.2 mm of motion per cycle, it can carry a payload of up to 200 g (16x its body weight) while moving. This work paves the way for fast-response soft robots in complex, unknown environments.
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, focusing on the proposed methods and results. A new Large-scale Diverse Video (LDV) dataset is employed in this challenge. The challenge has three tracks. Tracks 1 and 2 aim at enhancing videos compressed by HEVC at a fixed QP, while Track 3 aims at enhancing videos compressed by x265 at a fixed bit rate. In addition, Tracks 1 and 3 target improving fidelity (PSNR), while Track 2 targets improving perceptual quality. The three tracks in total attracted 482 registrations. In the test phase, 12 teams, 8 teams, and 11 teams submitted final results for Tracks 1, 2, and 3, respectively. The proposed methods and solutions gauge the state of the art in video quality enhancement. Homepage of the challenge: https://github.com/renyang-home/ntire21_venh
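Tracks 1 and 3 score fidelity with PSNR. As a reminder of what that metric computes, here is a minimal sketch for 8-bit frames; the challenge's official evaluation script may differ in details such as color-space handling:

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio between two 8-bit frames."""
    ref = ref.astype(np.float64)
    dist = dist.astype(np.float64)
    mse = np.mean((ref - dist) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# toy example: a constant offset of 1 over the whole frame -> MSE = 1
a = np.zeros((4, 4), dtype=np.uint8)
b = np.ones((4, 4), dtype=np.uint8)
print(round(psnr(a, b), 2))  # 10*log10(255^2) ≈ 48.13
```

Higher PSNR means the enhanced frame is closer to the uncompressed reference.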
Global pooling is one of the most significant operations in many machine learning models and tasks, which works for information fusion and structured data (like sets and graphs) representation. However, without solid mathematical fundamentals, its practical implementations often depend on empirical mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In this work, we develop a novel and generalized global pooling framework through the lens of optimal transport. The proposed framework is interpretable from the perspective of expectation-maximization. Essentially, it aims at learning an optimal transport across sample indices and feature dimensions, making the corresponding pooling operation maximize the conditional expectation of input data. We demonstrate that most existing pooling methods are equivalent to solving a regularized optimal transport (ROT) problem with different specializations, and more sophisticated pooling operations can be implemented by hierarchically solving multiple ROT problems. Making the parameters of the ROT problem learnable, we develop a family of regularized optimal transport pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep implicit layer. Their model architectures correspond to different optimization algorithms. We test our ROTP layers in several representative set-level machine learning scenarios, including multi-instance learning (MIL), graph classification, graph set representation, and image classification. Experimental results show that applying our ROTP layers can reduce the difficulty of the design and selection of global pooling -- our ROTP layers may either imitate some existing global pooling methods or lead to some new pooling layers fitting data better. The code is available at \url{https://github.com/SDS-Lab/ROT-Pooling}.
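To make the transport view of pooling concrete, the toy sketch below computes an entropic OT plan between sample indices and feature dimensions and fuses the input accordingly; with a constant cost it reduces to mean pooling. This is an illustration of the general idea, not the paper's ROTP layer, and the cost choice (`-X`, favoring high activations) is an assumption:

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropic-regularized OT plan between uniform marginals."""
    n, m = cost.shape
    a, b = np.ones(n) / n, np.ones(m) / m
    K = np.exp(-cost / reg)
    v = np.ones(m)
    for _ in range(n_iters):  # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def rot_pool(X, reg=0.1):
    """Pool n feature vectors (n x d) into one d-vector: transport mass
    between sample indices and feature dimensions, then fuse per dimension."""
    n, d = X.shape
    P = sinkhorn_plan(-X, reg)      # n x d plan; each column sums to 1/d
    return (P * X).sum(axis=0) * d  # convex combination per dimension

print(rot_pool(np.ones((3, 2))))  # uniform input: reduces to mean pooling, ≈ [1, 1]
```

Different regularizers and costs recover different classical pooling operators, which is the sense in which pooling design reduces to choosing an ROT problem.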
While many systems have been developed to train Graph Neural Networks (GNNs), efficient model inference and evaluation remain to be addressed. For instance, using the widely adopted node-wise approach, model evaluation can account for up to 94% of the time in the end-to-end training process due to neighbor explosion, which means that a node accesses its multi-hop neighbors. On the other hand, layer-wise inference avoids the neighbor explosion problem by conducting inference layer by layer such that the nodes only need their one-hop neighbors in each layer. However, implementing layer-wise inference requires substantial engineering efforts because users need to manually decompose a GNN model into layers for computation and split workload into batches to fit into device memory. In this paper, we develop Deep Graph Inference (DGI) -- a system for easy and efficient GNN model inference, which automatically translates the training code of a GNN model for layer-wise execution. DGI is general for various GNN models and different kinds of inference requests, and supports out-of-core execution on large graphs that cannot fit in CPU memory. Experimental results show that DGI consistently outperforms layer-wise inference across different datasets and hardware settings, and the speedup can be over 1,000x.
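The layer-wise execution pattern that DGI automates can be sketched in dense form as follows: each propagation step touches only one-hop neighbors, and the embeddings of all nodes are materialized before moving to the next layer. The function and the GCN-style propagation are illustrative stand-ins, not DGI's API:

```python
import numpy as np

def layerwise_inference(A_norm, X, weights):
    """Run an L-layer GCN-style model layer by layer: every node needs
    only its one-hop neighbors per layer, so no multi-hop neighborhood
    ever has to be gathered (avoiding neighbor explosion)."""
    H = X
    for W in weights:
        H = A_norm @ H @ W          # one-hop propagation + transform
        H = np.maximum(H, 0.0)      # ReLU
    return H

# toy 3-node path graph with self-loops, row-normalized adjacency
A = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=float)
A_norm = A / A.sum(axis=1, keepdims=True)
X = np.eye(3)
weights = [np.eye(3), np.eye(3)]    # identity layers for illustration
out = layerwise_inference(A_norm, X, weights)
print(out.shape)  # (3, 3)
```

In a real system each `A_norm @ H` step would additionally be split into batches so that one layer's workload fits in device memory, which is the engineering burden DGI removes.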
Graph neural networks (GNNs) are a promising approach for applications with unstructured data. However, training GNNs on large-scale graphs with hundreds of millions of nodes is both resource- and time-consuming. Unlike DNNs, GNNs usually have larger memory footprints, and thus GPU memory capacity and PCIe bandwidth are the main resource bottlenecks in GNN training. To address this problem, we present BiFeat: a graph feature quantization methodology that accelerates GNN training by significantly reducing the memory footprint and PCIe bandwidth requirement, so that GNNs can take full advantage of GPU computing capabilities. Our key insight is that, unlike DNNs, GNNs are less prone to the information loss of input features caused by quantization. We identify the main accuracy-impact factors in graph feature quantization and theoretically prove that BiFeat training converges to a network whose loss is within $\epsilon$ of the optimal loss of the uncompressed network. We perform an extensive evaluation of BiFeat using several popular GNN models and datasets, including training on MAG240M, the largest public graph dataset. The results show that BiFeat achieves a compression ratio of more than 30 and improves GNN training speed by 200%-320% with marginal accuracy loss. In particular, BiFeat sets a record by training a GNN on MAG240M within one hour using only four GPUs.
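The idea that GNN input features tolerate aggressive quantization can be illustrated with a simple uniform scalar quantizer; the actual scheme in the paper is more elaborate, so treat this as a sketch of the memory/bandwidth saving only:

```python
import numpy as np

def quantize(X, n_bits=8):
    """Uniform per-feature quantization of float features to n_bits integers."""
    lo = X.min(axis=0, keepdims=True)
    hi = X.max(axis=0, keepdims=True)
    scale = (hi - lo) / (2 ** n_bits - 1)
    scale[scale == 0] = 1.0              # guard for constant columns
    q = np.round((X - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16)).astype(np.float32)
q, lo, scale = quantize(X)
err = np.abs(dequantize(q, lo, scale) - X).max()
print(q.nbytes / X.nbytes)  # 0.25: int8 features move 4x less data over PCIe
```

The round-trip error is bounded by half a quantization step per feature, which is the kind of bounded perturbation the paper's convergence analysis reasons about.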
In this report, we present the technical details of our submission to the EPIC-Kitchens 2022 Unsupervised Domain Adaptation (UDA) Challenge. Existing UDA methods align global features extracted from whole video clips across the source and target domains, but suffer from the spatial redundancy of feature matching in video recognition. Motivated by the observation that, in most cases, a small image region in each video frame can be informative enough for the action recognition task, we propose to exploit informative image regions to perform efficient domain alignment. Specifically, we first use a lightweight CNN to extract the global information of the input two-stream video frames and select informative image patches through a differentiable interpolation-based selection strategy. The global information from the video frames and the local information from the image patches are then processed by an existing video adaptation method, i.e., TA3N, to perform feature alignment for the source and target domains. Our method (without model ensembles) ranked 4th on this year's EPIC-Kitchens-100 test set.
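A softmax-weighted combination of candidate patches gives the flavor of selecting informative regions differentiably: at low temperature the soft combination approaches hard argmax selection while remaining a smooth function of the scores. The variance score and all names here are hypothetical stand-ins; the submission's interpolation-based selector is more involved:

```python
import numpy as np

def extract_patches(frame, k):
    """All non-overlapping k x k patches of a 2-D frame."""
    h, w = frame.shape
    return [frame[i:i + k, j:j + k]
            for i in range(0, h - k + 1, k)
            for j in range(0, w - k + 1, k)]

def soft_patch_selection(frame, k=4, temperature=0.1):
    """Softmax over patch scores yields a convex combination of patches;
    low temperature approaches hard (argmax) selection."""
    patches = np.stack(extract_patches(frame, k))
    scores = patches.var(axis=(1, 2))                 # informativeness proxy
    w = np.exp((scores - scores.max()) / temperature)  # stable softmax
    w = w / w.sum()
    return (w[:, None, None] * patches).sum(axis=0)

frame = np.zeros((8, 8))
frame[0:4, 0:4] = np.arange(16).reshape(4, 4)  # the one high-variance patch
sel = soft_patch_selection(frame, k=4, temperature=1e-3)
print(sel.shape)  # (4, 4): the textured patch is recovered
```

Because every operation is differentiable, gradients can flow from the recognition loss back into whatever network produces the patch scores.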
Global pooling is one of the most significant operations in many machine learning models and tasks, but its implementation in practice is often empirical. In this study, we develop a novel and solid global pooling framework through the lens of optimal transport. We demonstrate that most existing global pooling methods are equivalent to solving some specialization of an unbalanced optimal transport (UOT) problem. Making the parameters of the UOT problem learnable, we unify various global pooling methods within the same framework and accordingly propose a generalized global pooling layer for neural networks, called UOT-Pooling (UOTP). Besides implementing the UOTP layer based on the classic Sinkhorn-scaling algorithm, we design a new model architecture based on the Bregman ADMM algorithm, which has better numerical stability and can reproduce existing pooling layers more effectively. We test our UOTP layers in several application scenarios, including multi-instance learning, graph classification, and image classification. Our UOTP layers can either imitate conventional global pooling layers or learn new pooling mechanisms that yield better performance.
Gaussian graphical models provide a powerful framework for uncovering conditional dependence relationships between sets of nodes; they have found applications in a wide variety of fields including sensor and communication networks, physics, finance, and computational biology. Often, one observes data on the nodes and the task is to learn the graph structure, or perform graphical model selection. While this is a well-studied problem with many popular techniques, there are typically three major practical challenges: i) many existing algorithms become computationally intractable in huge-data settings with tens of thousands of nodes; ii) the need for separate data-driven hyperparameter tuning considerably adds to the computational burden; iii) the statistical accuracy of selected edges often deteriorates as the dimension and/or the complexity of the underlying graph structures increase. We tackle these problems by developing the novel Minipatch Graph (MPGraph) estimator. Our approach breaks up the huge graph learning problem into many smaller problems by creating an ensemble of tiny random subsets of both the observations and the nodes, termed minipatches. We then leverage recent advances that use hard thresholding to solve the latent variable graphical model problem to consistently learn the graph on each minipatch. Our approach is computationally fast, embarrassingly parallelizable, memory efficient, and has integrated stability-based hyperparameter tuning. Additionally, we prove that under weaker assumptions than those of the Graphical Lasso, our MPGraph estimator achieves graph selection consistency. We compare our approach to state-of-the-art computational approaches for Gaussian graphical model selection including the BigQUIC algorithm, and empirically demonstrate that our approach is not only more statistically accurate but also extensively faster for huge graph learning problems.
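The minipatch idea, tiny random subsets of both observations and nodes, each producing a subgraph estimate that is then aggregated by stability selection, can be sketched as follows. This is a schematic with pseudoinverse-based partial correlations and arbitrary thresholds standing in for the paper's thresholded latent-variable solver, not the MPGraph estimator itself:

```python
import numpy as np

def minipatch_graph(X, n_patches=100, m_obs=50, m_nodes=10,
                    threshold=0.5, seed=0):
    """Ensemble of tiny random subproblems: estimate edges on each
    minipatch by hard-thresholding partial correlations, then keep
    edges selected in a stable fraction of the patches they appear in."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    votes = np.zeros((p, p))
    seen = np.zeros((p, p))
    for _ in range(n_patches):
        rows = rng.choice(n, m_obs, replace=False)    # subset of observations
        cols = rng.choice(p, m_nodes, replace=False)  # subset of nodes
        S = np.cov(X[np.ix_(rows, cols)], rowvar=False)
        P = np.linalg.pinv(S)                         # precision estimate
        d = np.sqrt(np.abs(np.diag(P)))
        partial = np.abs(P) / np.outer(d, d)          # partial correlations
        sel = partial > 0.2                           # hard thresholding
        np.fill_diagonal(sel, False)
        votes[np.ix_(cols, cols)] += sel
        seen[np.ix_(cols, cols)] += 1
    seen[seen == 0] = 1
    return (votes / seen) > threshold                 # stability selection

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))                        # independent features
G = minipatch_graph(X)
print(G.shape)  # (20, 20) boolean adjacency
```

Each minipatch problem is tiny (here 50 x 10), so the loop parallelizes trivially, and the vote fraction doubles as built-in stability-based tuning.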
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
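The generative-replay loop can be sketched schematically: when a new domain arrives, a generator fit on earlier domains supplies surrogate samples, so raw data from old domains is never stored. All classes and names below are hypothetical placeholders (a Gaussian sampler stands in for GarDA's appearance generator, and a running mean stands in for segmentation-model training):

```python
import numpy as np

class GaussianReplayGenerator:
    """Stand-in generator: memorizes per-domain feature statistics and
    replays synthetic samples instead of retaining raw data."""
    def __init__(self):
        self.stats = []

    def fit(self, X):
        self.stats.append((X.mean(axis=0), X.std(axis=0)))

    def sample(self, n, rng):
        mu, sd = self.stats[rng.integers(len(self.stats))]
        return rng.normal(mu, sd, size=(n, mu.size))

def continual_adapt(domains, rng):
    gen = GaussianReplayGenerator()
    model = None
    for X in domains:                          # domains arrive sequentially
        if gen.stats:
            replay = gen.sample(len(X), rng)   # consolidate old domains
            X_train = np.vstack([X, replay])
        else:
            X_train = X
        model = X_train.mean(axis=0)           # stand-in for a model update
        gen.fit(X)                             # update generator, drop raw X
    return model

rng = np.random.default_rng(0)
domains = [rng.normal(d, 1.0, size=(200, 4)) for d in range(3)]
print(continual_adapt(domains, rng).shape)  # (4,)
```

The key property mirrored here is that each update sees both new-domain data and replayed surrogates of all earlier domains, while nothing from the earlier domains is retained verbatim.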